DOCUMENT-IDENTIFIER: US 6529920 B1

TITLE: Multimedia linking device and method 



US Patent No. (1) : 
6529920 

Brief Summary Text (6) : 

There are graphical computer playback applications that allow a user to select a
single point in an audio or video recording (e.g., by positioning a cursor in a visual
representation of the media) and then type in keyword(s) (U.S. Pat. Nos. 5,786,814;
5,717,879; and 5,717,869; and Cruz et al. Capturing and Playing Multimedia Events with
STREAMS. In Proceedings of ACM Multimedia 1994, pages 193-200. ACM, 1994). As
described in Degen et al. Working with audio: Integrating personal tape recorders and
desktop computers. In Proceedings of CHI '92, pages 413-418. ACM, 1992, a user
manually creates an index or "marker" during recording by pressing one of two buttons
on a tape recorder and these marks are then displayed graphically. These systems have
limited utility because they rely on the user to manually index the recordings.

Brief Summary Text (9) : 

Some systems use handwritten notes taken during recording to index audio or video.
U.S. Pat. No. 4,841,387 describes a system that indexes tape recordings with notes
captured during recording. The writing surface is an electronic touchpad. All indices
are created during recording only and stored in a reserved portion at the beginning of
a microcassette tape. The user cannot add notes that index the recording during
playback. The display surface is grouped into rectangular areas to save storage space;
this has the disadvantage of making the system coarser grained than if each mark or
pen stroke were indexed. In addition, a user has to put the device in a special "review
mode" (by pressing a button) before being able to select a location in the notes for
playback. Other systems index audio and/or video recordings with notes handwritten on
a computer display screen or electronic whiteboard during recording (U.S. Pat. Nos.
5,535,063; 5,818,436; 5,786,814; 5,717,879; and 5,717,869, as described in Whittaker,
Steve et al. Filochat: Handwritten Notes Provide Access to Recorded Conversations. In
the Proceedings of CHI '94, pages 271-277, ACM-SIGCHI, 1994, and as described in
Wilcox, Lynn, et al. Dynomite: A Dynamically Organized Ink and Audio Notebook. In the
Proceedings of CHI '97, pages 186-193, ACM-SIGCHI, 1997).

Brief Summary Text (16) : 

The present invention describes a multimedia recording device that combines the best
aspects of a paper notebook and a media recorder (i.e., for recording audio, video,
music or other time-varying media). The device can be used to record and index
interviews, lectures, telephone calls, in-person conversations, meetings, etc. In one
embodiment, a user takes notes in a paper notebook, and every pen stroke made during
recording, playback, or while stopped is linked with an audio and/or video recording.
In other embodiments, the writing medium could be a book, flip chart, white board,
stack of sheets held like a clip-board, pen computer, etc. (hereinafter referred to as
a "book"). Hereinafter, the term "page" generically refers to planar surfaces such as
each side of a leaf in a book, the book cover, a surface below the book, a touch
sensitive surface, a sheet of paper, flip chart, image on a screen, whiteboard, etc.

Detailed Description Text (9) : 

In one embodiment, the device links notes written in a paper book 17 with a digital 
audio recording. In other embodiments, the writing medium could be a book, flip chart,
white board, stack of sheets held like a clip-board, pen computer, etc. 

Detailed Description Text (24) : 

A volume control 417 is provided for adjusting the output level of the sound. This 
volume control 417 does not directly control the volume of the sound output, but 
instead sends a signal to the processor 101 which then instructs the program running 
on the host computer 103 to adjust the volume in software. 

Detailed Description Text (27) : 

FIG. 2 shows some sample data captured for each page of notes for one embodiment of a 
multimedia recorder. There are many different possible sources of data. Some examples 
are shown in the figure: an audio source 440 (speech, music, or other sound from a 
microphone, telephone, radio, etc.), a video source 442 (analog or digital video from 
an internal or external camera, video system, television output, etc.), and a writing 
source 444 (from a paper notebook, book, stack of paper, flip chart, white board, pen 
computer, etc.). Note that this is not an exhaustive list and there are other possible
sources of input. 

Detailed Description Text (32) : 

The multimedia recorder 61 captures a complete spatial map of all information written 
on each page. This includes every X-Y point (446, 448) that makes up every stylus 
stroke, as well as a pressure reading 450 from the stylus. The stylus pressure data 
450 is used to determine stylus ups and downs (i.e., when the stylus was placed down 
on a page, and when it was picked up). The stylus data can be used to very accurately
render the pages of notes as images on a computer screen and for high quality printing 
of the book pages for filing, faxing, etc. The data can be converted to PostScript, a 
GIF image, or other format. Each X-Y point (446, 448) that makes up each letter, word, 
or drawing acts as an index into the recording. Each X-Y point (446, 448) indexes the 
location in the recording that was being recorded or played back at the time the 
stroke was made, or if the system is stopped, the last portion of the recording that 
was played or recorded prior to stopping. In other embodiments where video is used, a 
video time code could also be stored for every X-Y point (446, 448).
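
The indexing scheme in the preceding paragraph can be illustrated with a short sketch. It is not taken from the patent: the `StrokePoint` record, its `media_offset_s` field, and the nearest-point lookup in `find_offset` are hypothetical names and choices, shown only to make concrete the idea that every captured X-Y point carries a position in the recording.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StrokePoint:
    """One digitized stylus sample (hypothetical record layout)."""
    x: int
    y: int
    pressure: int
    media_offset_s: float  # position in the recording when the point was made


def find_offset(points: List[StrokePoint], x: int, y: int) -> Optional[float]:
    """Return the recording offset of the stored point nearest to (x, y).

    Nearest-neighbour search is an assumption made for this sketch; the
    source only states that each X-Y point indexes the recording.
    """
    if not points:
        return None
    nearest = min(points, key=lambda p: (p.x - x) ** 2 + (p.y - y) ** 2)
    return nearest.media_offset_s


# Example page: two points from one word, one point from a later drawing.
page = [
    StrokePoint(10, 10, 40, 12.5),
    StrokePoint(11, 10, 42, 12.6),
    StrokePoint(80, 55, 38, 95.0),
]
```

Selecting near the later drawing, for example `find_offset(page, 79, 54)`, would cue playback to the offset stored with that drawing's point.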

Detailed Description Text (74) : 

In one embodiment, the device displays the amount of storage space available in terms
of hours and minutes of recording time available (rather than the raw number of bytes
available). The device takes into account the number of channels to be recorded. For
example, a two channel recording takes up twice the storage space of a one channel
recording. This way, prior to recording, users know exactly how long they can continue
to record to the disk (or other storage media) they are using.
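
The conversion from free space to recording time can be sketched as below. This is an illustration under stated assumptions, not the device's actual calculation: it assumes uncompressed audio at a fixed data rate, and the function name `remaining_time` and the default sample rate and sample size are hypothetical values chosen for the example.

```python
from typing import Tuple


def remaining_time(
    free_bytes: int,
    channels: int,
    sample_rate_hz: int = 22050,   # assumed value, not from the source
    bytes_per_sample: int = 2,     # assumed 16-bit samples
) -> Tuple[int, int]:
    """Convert free storage into (hours, minutes) of recording time."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    # Each extra channel multiplies the data rate, so two channels
    # halve the available recording time.
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    total_minutes = free_bytes // (bytes_per_second * 60)
    return divmod(total_minutes, 60)


# Enough free space for two hours of one-channel audio at the assumed rate.
free = 22050 * 2 * 3600 * 2
```

With these assumed parameters, the same free space yields half as much recording time for two channels as for one.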

Detailed Description Text (76) : 

FIG. 5 shows a block diagram of the hardware components for one embodiment of a 
multimedia recorder 61. In one embodiment, the multimedia recorder 61 has a host 
computer 103 that communicates with a variety of other devices. The host computer 103 
contains a central processing unit (CPU), random access memory (RAM), serial
input/output ports, a bus for communicating with other components, etc. The host
computer 103 can "boot" itself off of an internal or external flash memory unit, or by
other means.

Detailed Description Text (77) : 

An audio subsystem 307 communicates with the host computer 103. The audio subsystem 307
plays and records audio by using analog-to-digital and digital-to-analog converters,
or a codec (coder-decoder). The audio subsystem 307 connects to microphone 319 or line
level inputs (the output of 317), and speaker or line level outputs 321. Microphone
level inputs 319 can be amplified to line level through the use of an optional
external pre-amplifier 317 and connected to the line level input connectors of the
audio subsystem 307.

Detailed Description Text (78) : 

The host computer 103 reads and writes data to a removable storage unit 301 (such as a
magnetic disk, a magneto-optical disk, or solid state storage module). The host
computer 103 communicates to the removable storage unit 301 through a disk controller
unit 303 (such as SCSI) or the disk controller 303 may be built directly into the host
computer 103 (such as IDE).

Detailed Description Text (79) :

Stylus 519 location data (e.g., X-Y location, pressure, tilt, etc.) from a digitizing 
tablet 513 (such as with a commercially available tablet from Wacom Technology Corp.) 
is communicated to the host computer 103 through a serial (RS-232) or parallel port. 

Detailed Description Text (81) : 

The host computer 103 communicates with a processor 101 (such as a Motorola MC68HC11) 
that monitors and controls a variety of user interface components 309. FIG. 6 shows a
block diagram of the processor 101 and user interface components 309. The host 
computer 103 and processor 101 communicate through a serial or parallel port. The 
processor 101 reduces the load on the host computer 103 by monitoring the status of 
various user inputs 345 (e.g., buttons, analog inputs, etc.), and sends this 
information to the host computer 103 when there is a change of state in the user input 
345. The processor 101 also translates data from the host computer 103 and displays it 
to the user. The processor 101, for example, can take a text string sent by the host 
computer 103 and cause it to be displayed on the status display 401 by manipulating 
parallel control lines to the display. 
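
The change-of-state reporting described above can be sketched as follows. This is only an illustration of the idea, not firmware from the patent: the `InputMonitor` class, its `scan` method, and the input names are hypothetical, and a real microcontroller implementation would differ.

```python
from typing import Dict


class InputMonitor:
    """Remembers the last reading of each input and reports only changes."""

    def __init__(self) -> None:
        self._last: Dict[str, int] = {}

    def scan(self, readings: Dict[str, int]) -> Dict[str, int]:
        """Return the subset of readings that differ from the previous scan."""
        changed = {
            name: value
            for name, value in readings.items()
            if self._last.get(name) != value
        }
        # Only changed inputs need to be stored and forwarded to the host.
        self._last.update(changed)
        return changed
```

Because unchanged inputs produce an empty result, the host would receive a message only when a button or analog input actually changes, which is the load reduction the paragraph describes.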

Detailed Description Text (83) : 

The processor 101 also controls visual feedback 337 by receiving commands from the 
host computer 103. The processor 101 drives control lines for the light emitting 
elements in the timeline 532, the record LED 403, the status display 401 and related 
backlight, etc. 

Detailed Description Text (84) : 

Under control from the host computer 103, the processor 101 can send a signal to the 
removable storage drive 301 to eject the storage disk 333 or other medium.

Detailed Description Text (85) : 

The processor 101 also controls and reads data from the optical sensor used in the 
page identification system 71 and can communicate with a "smart" battery 331 using a 
"system management bus". The processor 101 can thus get status information from the 
battery (e.g., state of charge) and communicate it to the host computer 103. 

Detailed Description Text (86) : 

The processor 101 is programmed in software, and operates in both an interrupt-driven
and polled manner. Interrupts are generated when incoming communication is received
from the host computer 103 or when buttons are pushed. In addition, the processor 101
polls some input devices 345 (e.g., potentiometers, and some switches), and also polls
the state of charge from the "smart" battery 331 and the page identification subsystem
343.

Other Reference Publication (6) : 

"The World Through The Computer: Computer Augmented Interaction With Real World 
Environments"; Rekimoto et al.; UIST'95 Nov. 14-17, 1995: pp. 29-36.

Other Reference Publication (9) : 

"Voice Communication With Computers Conversational Systems"; Schmandt; 1994; Chap. 4 & 
12. 

Other Reference Publication (14) : 

L. Degen, R. Mander and G. Salomon. Working with audio: Integrating personal tape 
recorders and desktop computers. In Proceedings of CHI '92, pp. 413-418. ACM, 1992.

Brief Summary Text (10) :

Systems described in Stifelman, Lisa J. Augmenting Real-World Objects: A Paper-Based
Audio Notebook. In the Proceedings of CHI '96. ACM-SIGCHI, 1996 ("Stifelman 1996") and
Stifelman, Lisa J. The Audio Notebook: Paper and Pen Interaction with Structured
Speech. Doctoral Dissertation. Massachusetts Institute of Technology, Sep. 1997
("Stifelman 1997") index digital audio recordings with notes written in a paper
notebook during recording. Some limitations of these systems are as follows. Like the
previous systems just described, Stifelman (1996) and Stifelman (1997) focused on
real-time indexing; a limitation is that notes written during playback do not index
the recording. A further problem is the issue of distinguishing writing activity from
selections made for playback. In Stifelman (1996) and Stifelman (1997), if a user adds
to their notes when in a play mode, this could falsely trigger a playback selection.
Also, selections left visible marks on the pages. With systems that use a display
screen as the writing surface instead of paper, sometimes a circling gesture or other
gesture is used to select areas of writing for playback. This can also be error-prone
because the system has to distinguish between a circle drawn as data versus a circling
gesture or else the user must put the system in a special mode before making the
gesture, causing selection to be a two-step procedure.

Brief Summary Text (17) : 

For playback, users can cue a recording directly to a particular location simply by 
turning to the corresponding page of notes. An automatic page identification system 
recognizes the current page, making it fast and easy to navigate through a recording 
that spans a number of pages of data. Users can select any word, drawing, or mark on a 
page to instantly cue playback to the time around when the mark was made. A selection 
is made using a "stylus", where stylus is defined as a pen (either the writing end of 
a digitizing pen or the selecting end of a digitizing pen), finger, or other pointing
mechanism. The multimedia recorder is able to reliably distinguish between user 
notations that index the recording and selections intended to trigger playback. 

Detailed Description Text (11) : 

The cover and each page of the book 17 each have a printed page identification code 43. A
code is also printed on the surface of the multimedia recorder 61. In one embodiment,
the cover code is used to detect whether a book 17 is on the device, and the page code
43 identifies the page number. In other embodiments, these codes could be used to
store other kinds of information about the book or page (e.g., a code that uniquely
identifies the book as well as the page, the format of a page, etc.).

Detailed Description Text (54) : 

FIG. 1D shows two possible embodiments of a stylus 519. In one embodiment, a
selection-end 531 of a stylus 519 is pressed down on or pointed at a desired location
on a page. The two embodiments of the stylus 519 shown in FIG. 1D from left to right
are as follows: a stylus 519 with an ink tip on the writing-end and a button on the
selection-end 531, and a stylus 519 with an ink tip on the writing-end 529 and a
non-marking tip on the selection-end 531. In these embodiments, the selection-end 531
of the stylus is used to trigger playback from a location on a page. One stylus can be
used for both writing and selecting without getting any unwanted pen marks on the
page. These embodiments allow for easy distinction between writing and
selecting: whenever the writing-end 529 of the stylus 519 makes contact with the page,
this is considered writing activity; whenever the selection-end 531 of the stylus 519
makes contact with the page, this is considered selecting. Note that the device can
sense the stylus 519 location without making contact with the page. In one embodiment,
a pressure reading from the stylus 519 is used to determine if a selection action has
been made (i.e., the pressure is above a threshold). Another advantage of this design
is that it is easily discovered and learned by the user. In another embodiment, a
single-ended stylus (not shown) can be used where a button on the stylus switches
between a writing function and a selection function. In yet another embodiment, the
system can distinguish between a writing stylus and a selection stylus using an
identifier communicated from the stylus 519 to the multimedia recorder 61.
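
The distinction between writing and selecting can be sketched as a small classification step. This is a hedged illustration, not the patented logic: the names `StylusEnd`, `Action`, and `classify` and the threshold value are hypothetical, and applying the same pressure threshold to the writing-end is an assumption made here (the source states the threshold only for selection).

```python
from enum import Enum


class StylusEnd(Enum):
    WRITING = "writing"      # ink tip (writing-end 529 in the text)
    SELECTION = "selection"  # button or non-marking tip (selection-end 531)


class Action(Enum):
    WRITE = "write"
    SELECT = "select"
    NONE = "none"


def classify(end: StylusEnd, pressure: int, threshold: int = 50) -> Action:
    """Map a stylus sample to an action.

    The threshold of 50 is an arbitrary illustrative value.
    """
    if pressure <= threshold:
        # Stylus is sensed but not pressed firmly enough to count as contact.
        return Action.NONE
    return Action.WRITE if end is StylusEnd.WRITING else Action.SELECT
```

Keying the decision on which end touched the page means no separate mode switch is needed before a selection, which is the property the paragraph emphasizes.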

Other Reference Publication (23) :

Stifelman, Lisa J. The Audio Notebook: Paper and Pen Interaction with Structured 
Speech. Doctoral Dissertation. Massachusetts Institute of Technology, Sep. 1997. 
